Distributed information-theoretic clustering

Abstract

We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(y^n)$ that maximize the mutual information $\mathrm{I}(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the CEO problem with a mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region.
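To make the objective concrete, the following is a minimal sketch that evaluates $\mathrm{I}(f(X^n); g(Y^n))/n$ exactly by brute-force enumeration for a small block length. It assumes a doubly symmetric binary source (uniform $X$, with $Y$ obtained by flipping each bit of $X$ independently with probability `p`). The helper names (`normalized_mutual_information`, `first_bit`, `majority`, `binary_entropy`) are illustrative and do not come from the paper, and the sketch only scores given encoders; it does not search for optimal ones.

```python
import itertools
import math
from collections import defaultdict


def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def normalized_mutual_information(f, g, n, p):
    """Exact I(f(X^n); g(Y^n)) / n in bits.

    Assumption: X^n is i.i.d. uniform on {0, 1} and Y^n is X^n passed
    through a binary symmetric channel with crossover probability p.
    """
    joint = defaultdict(float)
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            flips = sum(a != b for a, b in zip(x, y))
            prob = 0.5 ** n * p ** flips * (1 - p) ** (n - flips)
            joint[(f(x), g(y))] += prob

    # Marginals of the two encoder outputs.
    p_u = defaultdict(float)
    p_v = defaultdict(float)
    for (u, v), q in joint.items():
        p_u[u] += q
        p_v[v] += q

    mi = sum(
        q * math.log2(q / (p_u[u] * p_v[v]))
        for (u, v), q in joint.items()
        if q > 0
    )
    return mi / n


def first_bit(seq):
    """Hypothetical one-bit encoder: keep only the first symbol."""
    return seq[0]


def majority(seq):
    """Hypothetical one-bit encoder: majority vote over the block."""
    return int(2 * sum(seq) > len(seq))


if __name__ == "__main__":
    n, p = 3, 0.1
    print("first bit:", normalized_mutual_information(first_bit, first_bit, n, p))
    print("majority: ", normalized_mutual_information(majority, majority, n, p))
```

Both example encoders output a single bit, so each uses rate $1/n$ bits per source symbol. For the first-bit encoders the result equals $(1 - h(p))/n$, where $h$ is the binary entropy function, which gives a quick sanity check on the enumeration.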

Similar resources

Information Theoretic Clustering

Clustering is one of the important topics in pattern recognition. Since only the structure of the data dictates the grouping (unsupervised learning), information theory is an obvious criterion to establish the clustering rule. This paper describes a novel valley seeking clustering algorithm using an information theoretic measure to estimate the cost of partitioning the data set. The information ...

Information Theoretic Hierarchical Clustering

Hierarchical clustering has been extensively used in practice, where clusters can be assigned and analyzed simultaneously, especially when estimating the number of clusters is challenging. However, due to the conventional proximity measures recruited in these algorithms, they are only capable of detecting mass-shape clusters and encounter problems in identifying complex data structures. Here, w...

Demystifying Information-Theoretic Clustering

Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo. 1 Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, USA 2 University of Southern California, Los Angeles, CA 90089, USA 3 Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA 4 School of Informatics and Computing, Indiana University, 901 E 1...

Information Theoretic Pairwise Clustering

In this paper we develop an information-theoretic approach for pairwise clustering. The Laplacian of the pairwise similarity matrix can be used to define a Markov random walk on the data points. This view forms a probabilistic interpretation of spectral clustering methods. We utilize this probabilistic model to define a novel clustering cost function that is based on maximizing the mutual infor...

Nonparametric Information Theoretic Clustering Algorithm

In this paper we propose a novel clustering algorithm based on maximizing the mutual information between data points and clusters. Unlike previous methods, we neither assume the data are given in terms of distributions nor impose any parametric model on the within-cluster distribution. Instead, we utilize a non-parametric estimation of the average cluster entropies and search for a clustering t...

Journal

Journal title: Information and Inference: A Journal of the IMA

Year: 2021

ISSN: 2049-8772, 2049-8764

DOI: https://doi.org/10.1093/imaiai/iaab007